New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extract string parsing to a separate package #14772
Extract string parsing to a separate package #14772
Conversation
Amazing! Lines 208 to 214 in 486d27a
Can you add it here? |
43cc94a
to
9971813
Compare
Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/52568/ |
@@ -20,7 +20,7 @@ | |||
"extra": { | |||
"rawValue": "\\u{}", | |||
"raw": "\"\\u{}\"", | |||
"expressionValue": "null" | |||
"expressionValue": "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new behaviour here is better than current main. Should we set expressionValue
to be null
when the string can not be successfully parsed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit I don't know why the behavior changed 😬
But yes, null
would be even better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, expressionValue
is a property of directives that represents the .value
property of the originally parsed StringLiteral
. I feat that making it null
would break anything relying on the fact that StringLiteral.value
is a string, so I prefer the behavior currently implemented by this PR of just replacing invalid escapes with ""
(for example, ab\u{}cd
is abcd
).
700d6f0
to
23418cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a lax benchmark with no significant performance penalty.
current 1 jquery 3.6: 40.11 ops/sec ±5.28% (25ms)
current 4 jquery 3.6: 9.5 ops/sec ±3.78% (105ms)
current 16 jquery 3.6: 2.22 ops/sec ±7.14% (451ms)
current 64 jquery 3.6: 0.54 ops/sec ±2.83% (1861ms)
baseline 1 jquery 3.6: 41.42 ops/sec ±5.53% (24ms)
baseline 4 jquery 3.6: 9.89 ops/sec ±2.5% (101ms)
baseline 16 jquery 3.6: 2.36 ops/sec ±5% (424ms)
baseline 64 jquery 3.6: 0.55 ops/sec ±9.05% (1830ms)
Good! I believe rewriting all the tokenizer like this could actually improve performance, because we would skip all the |
"devDependencies": { | ||
"charcodes": "^0.2.0" | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if @babel/helper-string-parser
will be bundled so asking to be on the safe side; shouldn't this be a normal dependency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a plugin that inlines charcodes.***
into the corresponding codepoints: https://github.com/xtuc/charcodes/blob/master/packages/babel-plugin-transform-charcodes/README.md
This was hard, but it lets us avoid an external dependency with some logic that we already have in #14757.
@liuxingbaoyu you should be able to do something like this:
This PR mostly moves code around, with one main change: instead of directly updating
this.state.{pos,curLine,lineStart}
, we keep track of them in variables and we update them only after returning from the helper parser functions.